CUBEFREE WORDS WITH MANY SQUARES 



JAMES CURRIE AND NARAD RAMPERSAD 

Abstract. We construct infinite cubefree binary words containing exponentially many 
distinct squares of length n. We also show that for every positive integer n, there is a 
cubefree binary square of length 2n. 



1. Introduction 

A square is a non-empty word of the form xx, and a cube is a non-empty word of the form 
xxx. An overlap is a word of the form axaxa, where a is a letter and x is a word (possibly 
empty). A word is squarefree (resp. cubefree, overlap-free) if none of its factors are squares 
(resp. cubes, overlaps). For further background material concerning combinatorics on words 
we refer the reader to [2]. 
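The definitions above translate directly into executable checks. The following Python sketch is an illustration only and is not part of the original paper; the helper names `_has_run`, `has_square`, `has_overlap` and `has_cube` are hypothetical. It uses the fact that a word w has a factor of length L with period p exactly when there are L - p consecutive positions i with w[i] = w[i + p].

```python
def _has_run(w: str, p: int, needed: int) -> bool:
    """True if w has `needed` consecutive positions i with w[i] == w[i + p]."""
    run = 0
    for i in range(len(w) - p):
        run = run + 1 if w[i] == w[i + p] else 0
        if run >= needed:
            return True
    return False


def has_square(w: str) -> bool:
    """True if w contains a factor xx with x non-empty (length 2p, period p)."""
    return any(_has_run(w, p, p) for p in range(1, len(w) // 2 + 1))


def has_overlap(w: str) -> bool:
    """True if w contains a factor axaxa (length 2p + 1, period p = |ax|)."""
    return any(_has_run(w, p, p + 1) for p in range(1, (len(w) - 1) // 2 + 1))


def has_cube(w: str) -> bool:
    """True if w contains a factor xxx with x non-empty (length 3p, period p)."""
    return any(_has_run(w, p, 2 * p) for p in range(1, len(w) // 3 + 1))


if __name__ == "__main__":
    for word in ["0110", "01010", "010101", "0110100110010110"]:
        print(word, has_square(word), has_overlap(word), has_cube(word))
```

A word is squarefree, overlap-free or cubefree when the corresponding function returns `False`.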

It is well-known that there exist infinite squarefree words over a ternary alphabet and 
infinite overlap- free words over a binary alphabet. Clearly, any overlap-free word is also 
cubefree. Any infinite cubefree binary word must contain squares; however, Dekking [8] 
proved that there exists an infinite cubefree binary word containing no squares xx where 
the length of x is greater than 3 (see also [13] [II]). In this paper we consider instead the 
existence of infinite cubefree binary words with many distinct squares. 

Most known constructions of infinite cubefree words involve the iteration of a morphism. 
Words constructed in this manner are often referred to as infinite D0L words. Ehrenfeucht 
and Rozenberg [9, 10, 11] proved several results concerning the factor complexity of infinite 
D0L words. They showed that any squarefree or cubefree D0L word has $O(n \log n)$ factors 
of length n. Thus, an infinite cubefree D0L word cannot have many distinct square factors. 
By contrast, we show here how to construct infinite cubefree binary words containing 
exponentially many distinct squares of length n. 

Other work related to the problems considered here includes [H El El]. 

Let $\mu$ denote the Thue-Morse morphism: i.e., the morphism that maps $0 \to 01$ and 
$1 \to 10$. The Thue-Morse word is the infinite word 

t = 011010011001011010010110 $\cdots$ 

obtained by iteratively applying $\mu$ to the word 0. The Thue-Morse word is well-known to 
be overlap-free, and hence, a fortiori, cubefree. The squares occurring in the Thue-Morse 
word were characterized by Pansiot [12] and Brlek [1] as follows. Define sets A = 
{00, 11, 010010, 101101} and 

$$\mathcal{A} = \bigcup_{k \ge 0} \mu^k(A).$$

The set $\mathcal{A}$ is the set of squares appearing in the Thue-Morse word. 
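The characterization of the squares in the Thue-Morse word can be checked on a finite prefix. The sketch below is an illustration, not the authors' code: it iterates the Thue-Morse morphism on 0, collects the short squares in the resulting prefix, and compares them with the union of the images of A = {00, 11, 010010, 101101} under the powers of the morphism. The function names and the chosen prefix length are assumptions made for the example.

```python
def mu(w: str) -> str:
    """Thue-Morse morphism: 0 -> 01, 1 -> 10."""
    return "".join("01" if c == "0" else "10" for c in w)


def thue_morse_prefix(k: int) -> str:
    """Prefix of the Thue-Morse word of length 2**k, computed as mu^k(0)."""
    t = "0"
    for _ in range(k):
        t = mu(t)
    return t


def squares_in(w: str, max_half: int) -> set:
    """All square factors xx of w with 1 <= |x| <= max_half."""
    found = set()
    for half in range(1, max_half + 1):
        for i in range(len(w) - 2 * half + 1):
            if w[i:i + half] == w[i + half:i + 2 * half]:
                found.add(w[i:i + 2 * half])
    return found


def images_of_a(max_k: int) -> set:
    """Union of mu^k(A) for 0 <= k <= max_k, with A as defined in the text."""
    level = {"00", "11", "010010", "101101"}
    result = set(level)
    for _ in range(max_k):
        level = {mu(s) for s in level}
        result |= level
    return result


if __name__ == "__main__":
    t = thue_morse_prefix(12)
    print(t[:24])
    print(sorted(squares_in(t, 16), key=lambda s: (len(s), s)))
```

On a prefix of length 4096, the squares xx with |x| at most 16 coincide with the words of length at most 32 in the union, as the characterization predicts.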

Shelton and Soni [15] characterized the overlap-free squares (the result is also attributed 
to Thue by Berstel [3]) as being the conjugates of the words in $\mathcal{A}$. (A conjugate of x is a 
word y such that x = uv and y = vu for some u, v.) Currie and Rampersad [S] showed that 
the conjugates of the words in $\mathcal{A}$ are also precisely the 7/3-power-free squares. Thus, there 
are only 7/3-power-free squares of length 2n when n is a power of 2, or 3 times a power 
of 2. By contrast, we show that there are cubefree binary squares of length 2n for every 
positive integer n. We use this result to construct infinite cubefree binary words containing 
exponentially many distinct squares. 
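The restriction on the lengths of overlap-free squares can be observed by exhaustive search. The following sketch (an illustration under the stated definitions, with hypothetical helper names) enumerates all binary squares xx with |x| at most 10, keeps the overlap-free ones, and compares them with the conjugates of the images of {00, 11, 010010, 101101} under powers of the Thue-Morse morphism.

```python
from itertools import product


def mu(w: str) -> str:
    """Thue-Morse morphism: 0 -> 01, 1 -> 10."""
    return "".join("01" if c == "0" else "10" for c in w)


def has_overlap(w: str) -> bool:
    """True if w contains a factor of length 2p + 1 with period p."""
    for p in range(1, (len(w) - 1) // 2 + 1):
        run = 0  # consecutive positions i with w[i] == w[i + p]
        for i in range(len(w) - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run >= p + 1:
                return True
    return False


def conjugates(w: str) -> set:
    """All cyclic rotations of w."""
    return {w[i:] + w[:i] for i in range(len(w))}


def thue_morse_squares(max_len: int) -> set:
    """Images mu^k(s), s in {00, 11, 010010, 101101}, of length <= max_len."""
    level = {"00", "11", "010010", "101101"}
    result = set()
    while any(len(s) <= max_len for s in level):
        result |= {s for s in level if len(s) <= max_len}
        level = {mu(s) for s in level}
    return result


def overlap_free_squares(max_half: int) -> set:
    """All overlap-free binary squares xx with 1 <= |x| <= max_half."""
    out = set()
    for n in range(1, max_half + 1):
        for letters in product("01", repeat=n):
            x = "".join(letters)
            if not has_overlap(x + x):
                out.add(x + x)
    return out


if __name__ == "__main__":
    squares = overlap_free_squares(10)
    print(sorted({len(s) // 2 for s in squares}))
```

The half-lengths that occur are powers of 2 or 3 times a power of 2, in line with the statement above.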

2. Main results 

The main results of this paper are the following two theorems. 

Theorem 1. Let n be a positive integer. There exists a cubefree binary square of length 2n. 

Theorem 2. There exists an infinite cubefree binary word containing exponentially many 
distinct squares of length n. 
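For small n, Theorem 1 can be confirmed by brute force. The sketch below is only a sanity check, not the construction used in the paper: it searches all binary words x of length n in lexicographic order and returns the first square xx that is cubefree. The function names are hypothetical.

```python
from itertools import product


def has_cube(w: str) -> bool:
    """True if w contains a factor yyy with y non-empty."""
    for p in range(1, len(w) // 3 + 1):
        run = 0  # consecutive positions i with w[i] == w[i + p]
        for i in range(len(w) - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run >= 2 * p:  # a factor of length 3p with period p
                return True
    return False


def first_cubefree_square(n: int):
    """Return xx for the lexicographically least binary x of length n
    such that xx is cubefree, or None if no such x exists."""
    for letters in product("01", repeat=n):
        x = "".join(letters)
        if not has_cube(x + x):
            return x + x
    return None


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, first_cubefree_square(n))
```

The search is exponential in n, so it is useful only as a check of small cases; the proof below handles general n.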

We first establish some preliminary results. 

Lemma 3. The Thue-Morse word contains a factor of the form x = 1001x'' = x'1001 of 
every positive even length $n \ne 2, 6$. 

Proof. Aberkane and Currie [U Lemma 4] proved that for every integer $m \ge 6$, the Thue-Morse 
word contains a factor of length m of the form 10y10. Then the Thue-Morse word 
also contains the factor $\mu(10y10) = 1001\mu(y)1001$, which has length 2m. Finally, we observe 
that 10011001 and 1001101001 are factors of the Thue-Morse word of lengths 8 and 10 
respectively. □ 
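Lemma 3 can be illustrated on a finite prefix of the Thue-Morse word. The sketch below (hypothetical helper names; the prefix length 4096 and the range of n are arbitrary choices) looks for a factor of a given length that both begins and ends with 1001.

```python
def thue_morse_prefix(k: int) -> str:
    """Prefix of the Thue-Morse word of length 2**k."""
    t = "0"
    for _ in range(k):
        t = "".join("01" if c == "0" else "10" for c in t)
    return t


def lemma3_factor(t: str, n: int):
    """First factor of t of length n that starts and ends with 1001, or None."""
    for i in range(len(t) - n + 1):
        f = t[i:i + n]
        if f.startswith("1001") and f.endswith("1001"):
            return f
    return None


if __name__ == "__main__":
    t = thue_morse_prefix(12)
    for n in range(2, 41, 2):
        print(n, lemma3_factor(t, n))
```

For even n up to 40 the search succeeds except for n = 2 and n = 6. For n = 6 no word at all can both begin and end with 1001, since the two occurrences would disagree on the third letter.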

Lemma 4. If y is overlap-free and ayb is a cube of period p, then $p \le |ab|$. 

Proof. Otherwise deleting a and b removes less than a full period from ayb, leaving an 
overlap. □ 

Lemma 5. If z is a factor of yyy where |y| = p and $|z| \le p + 1$, then there are two occurrences 
of z in yyy. 

Proof. Certainly if z is a factor of yy it occurs twice in yyy. If z is a factor of yyy but not 
of yy, then z must span the central y of yyy and a bit more on both ends, giving z a length 
of p + 2 or more. □ 

Theorem 6. Let x be a factor of the Thue-Morse word of the form x = 1001x'' = x'1001. 
Then the word x0x0 is cubefree. 

Remark 1. Word 01010 occurs exactly once in x0x0. (Note that this word is an overlap, and 
hence not a factor of the Thue-Morse word.) 
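Theorem 6 and Remark 1 can be tested on concrete words. The following sketch is a finite experiment, not a substitute for the proof: it collects every factor x of a Thue-Morse prefix that begins and ends with 1001 (up to an arbitrarily chosen length), forms x0x0, and checks that the result is cubefree and contains 01010 exactly once. All function names are hypothetical.

```python
def thue_morse_prefix(k: int) -> str:
    """Prefix of the Thue-Morse word of length 2**k."""
    t = "0"
    for _ in range(k):
        t = "".join("01" if c == "0" else "10" for c in t)
    return t


def has_cube(w: str) -> bool:
    """True if w contains a factor yyy with y non-empty."""
    for p in range(1, len(w) // 3 + 1):
        run = 0
        for i in range(len(w) - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run >= 2 * p:
                return True
    return False


def count_occurrences(w: str, u: str) -> int:
    """Number of (possibly overlapping) occurrences of u in w."""
    return sum(1 for i in range(len(w) - len(u) + 1) if w.startswith(u, i))


def framed_factors(t: str, max_len: int) -> set:
    """Factors of t of length 4..max_len that start and end with 1001."""
    out = set()
    for n in range(4, max_len + 1):
        for i in range(len(t) - n + 1):
            f = t[i:i + n]
            if f.startswith("1001") and f.endswith("1001"):
                out.add(f)
    return out


if __name__ == "__main__":
    for x in sorted(framed_factors(thue_morse_prefix(10), 16), key=len):
        w = x + "0" + x + "0"
        print(w, has_cube(w), count_occurrences(w, "01010"))
```

Each word x0x0 is a square of length 2(|x| + 1).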

Proof of Theorem 6. Suppose yyy is a cube in x0x0 with |y| = p > 0. 

Case 1: Period $p \ge 4$. By Lemma 5 and Remark 1, word 01010 is not a factor of yyy. We 
have two possibilities: 

Case 1a: Cube yyy is a factor of x'100101. This is impossible by Lemma 4, since x'1001 
is overlap-free, |01| = 2, and $p \ge 4 > 2$. 

Case 1b: Cube yyy is a factor of 101001x''0. This is again impossible by Lemma 4, since 
1001x'' is overlap-free. 

Case 2: Period $p \le 3$. If 01010 is a factor of yyy, then one of 001010 and 010100 is a factor. 
However, neither of these has period 1, 2 or 3; this is impossible. We conclude that 01010 is 
not a factor of yyy. This gives a similar case breakdown as in Case 1. 

Case 2a: Cube yyy is a factor of x'100101. 

Case 2ai: Cube yyy is a suffix of x'100101. In this case, $p \le 2$ by Lemma 4, since 
x'1001 is overlap-free. However, the longest suffix of x'100101 of period 1 or 2 is 0101, which 
is cubefree. 

Case 2aii: Cube yyy is a suffix of x'10010. This forces p = 1, which is impossible. 

Case 2b: Cube yyy is a factor of 101001x''0. 

Case 2bi: Cube yyy is a prefix of 101001x''0 or of 01001x''0. Since $|yyy| = 3p \le 9 < 
|01001x''|$, yyy is a factor of 101001x''. This is symmetrical to Case 2a. 

Case 2bii: Cube yyy is a factor of 1001x''0 = x0. This is impossible by Case 2a. □ 

Theorem 7. Let x be a factor of the Thue-Morse word of the form x = 1001x'' = x'1001. 
Then the word x101100x101100 is cubefree. 

Remark 2. Word 00100 occurs exactly once in x101100x101100. Word 11011 occurs exactly 
twice. 
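Theorem 7 and Remark 2 admit the same kind of finite experiment. In the sketch below (hypothetical helper names, arbitrary length bound), x ranges over the factors of a Thue-Morse prefix that begin and end with 1001, and the word x101100x101100 is checked for cubes and for the number of occurrences of 00100 and 11011.

```python
def thue_morse_prefix(k: int) -> str:
    """Prefix of the Thue-Morse word of length 2**k."""
    t = "0"
    for _ in range(k):
        t = "".join("01" if c == "0" else "10" for c in t)
    return t


def has_cube(w: str) -> bool:
    """True if w contains a factor yyy with y non-empty."""
    for p in range(1, len(w) // 3 + 1):
        run = 0
        for i in range(len(w) - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run >= 2 * p:
                return True
    return False


def count_occurrences(w: str, u: str) -> int:
    """Number of (possibly overlapping) occurrences of u in w."""
    return sum(1 for i in range(len(w) - len(u) + 1) if w.startswith(u, i))


def framed_factors(t: str, max_len: int) -> set:
    """Factors of t of length 4..max_len that start and end with 1001."""
    out = set()
    for n in range(4, max_len + 1):
        for i in range(len(t) - n + 1):
            f = t[i:i + n]
            if f.startswith("1001") and f.endswith("1001"):
                out.add(f)
    return out


def theorem7_word(x: str) -> str:
    """The square (x101100)(x101100)."""
    return (x + "101100") * 2


if __name__ == "__main__":
    for x in sorted(framed_factors(thue_morse_prefix(10), 16), key=len):
        w = theorem7_word(x)
        print(w, has_cube(w),
              count_occurrences(w, "00100"), count_occurrences(w, "11011"))
```

Each such word is a square of length 2(|x| + 6).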

Proof of Theorem 7. Suppose yyy is a cube in x101100x101100 with |y| = p > 0. 

Case 1: Period $p \ge 4$. By Lemma 5 and Remark 2, word 00100 is not a factor of yyy. We 
have two possibilities: 

Case 1a: Cube yyy is a factor of x10110010. Word x10110010 contains 11011 as a factor 
exactly once. By Lemma 5 and Remark 2 there are two possibilities: 

Case 1ai: Cube yyy is contained in x101. In this case, $p \le 3$ by Lemma 4, since x is 
overlap-free. This is a contradiction. 

Case 1aii: Cube yyy is contained in 10110010. This is clearly impossible. 

Case 1b: Cube yyy is a factor of 0x101100. Again, word 0x101100 contains 11011 as 
a factor exactly once. Therefore, either yyy is contained in 101100 or in 0x101. The first 
alternative evidently is impossible, while the second is ruled out by Lemma 4. 

Case 2: Period $p \le 3$. If 00100 is a factor of yyy, then we must have p = 3, since 00100 
does not have period 1 or 2. However, in x101100x101100, the maximal factor of period 3 
containing 00100 is 1001001, which is not a cube. We conclude that 00100 is not a factor of 
yyy. This gives a similar case breakdown to Case 1: 

Case 2a: Cube yyy is a factor of x10110010. By Lemma 4, the word x10 must be cubefree. 
Therefore, yyy must be a suffix of one of these words: 

$w_8 = x'100110110010$ 
$w_7 = x'10011011001$ 
$w_6 = x'1001101100$ 
$w_5 = x'100110110$ 
$w_4 = x'10011011$ 
$w_3 = x'1001101$ 

None of the $w_n$ ends in a cube of period 1, 2 or 3. (In the case of words $w_4$, $w_3$, the longest 
suffixes of period 3 have lengths 6 and 5 respectively.) It follows that yyy is not a suffix of 
any of the $w_n$, and this case does not occur. 
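The suffix check in Case 2a is a finite computation. The sketch below is an illustration with hypothetical names: for n = 3, ..., 8 it takes the known part of the word formed by x followed by the first n letters of 10110010 (that is, the final 1001 of x together with those n letters) and computes the longest suffix of period 1, 2 and 3. The unknown prefix x' plays no role as long as the periodic suffix ends strictly inside the known part, which the code also verifies.

```python
TAIL = "10110010"  # letters following x in x10110010


def known_part(n: int) -> str:
    """Final 1001 of x followed by the first n letters of TAIL."""
    return "1001" + TAIL[:n]


def longest_suffix_with_period(s: str, p: int) -> int:
    """Length of the longest suffix of s having period p (requires len(s) > p)."""
    length = p  # any word of length p trivially has period p
    while length < len(s) and s[-1 - length] == s[-1 - length + p]:
        length += 1
    return length


def ends_in_short_period_cube(s: str) -> bool:
    """True if s has a suffix yyy with 1 <= |y| <= 3."""
    return any(longest_suffix_with_period(s, p) >= 3 * p for p in (1, 2, 3))


if __name__ == "__main__":
    for n in range(8, 2, -1):
        s = known_part(n)
        lengths = [longest_suffix_with_period(s, p) for p in (1, 2, 3)]
        # The periodic suffix must stop inside the known part of the word.
        assert all(length < len(s) for length in lengths)
        print(n, s, lengths, ends_in_short_period_cube(s))
```

A cube of period p needs a suffix of length 3p, so lengths below 3, 6 and 9 for p = 1, 2 and 3 rule out a cube ending at that position.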

Case 2b: Cube yyy is a factor of 0x101100. Since $|yyy| = 3p \le 9 < |0x|$, yyy is a factor 
of 0x or of x101100. The first possibility was ruled out in Theorem 6, and the second in 
Case 2a. □ 

Theorems 6 and 7 together establish Theorem 1. Next we show that the number of cubefree 
binary squares of length n grows exponentially. 

Proposition 8. There exist exponentially many cubefree binary squares of length n. 

Proof. Let m be a positive integer and let xx be a cubefree binary square of length 2m over 
{0, 1}. Suppose that 0 occurs at least as often as 1 in x. Construct a new cubefree square 
yy over {0, 1, 2}, where y is obtained from x by arbitrarily replacing some of the 0's in x by 
2's. There are at least $2^{m/2}$ such squares yy of length 2m. 
Let h be the morphism 

$$0 \to 001011, \qquad 1 \to 001101, \qquad 2 \to 011001.$$

Brandenburg [5, Theorem 6] showed that h maps cubefree words to cubefree words. Moreover, 
since h is uniform and injective, the set of words h(yy) consists of at least $2^{m/2}$ cubefree 
squares of length 12m. Asymptotically, we thus have exponentially many cubefree binary 
squares of length n, as required. □ 
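The counting argument in this proof can be followed on a small instance. The sketch below is illustrative only: it starts from the cubefree square xx with x = 0110 (an arbitrary choice with m = 4), replaces each subset of the 0's of x by 2's, and applies the morphism h to the resulting squares yy. The function names are hypothetical.

```python
from itertools import product

# The morphism h from the proof of Proposition 8.
H = {"0": "001011", "1": "001101", "2": "011001"}


def apply(morphism: dict, w: str) -> str:
    """Image of w under a morphism given letter by letter."""
    return "".join(morphism[c] for c in w)


def has_cube(w: str) -> bool:
    """True if w contains a factor yyy with y non-empty."""
    for p in range(1, len(w) // 3 + 1):
        run = 0
        for i in range(len(w) - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run >= 2 * p:
                return True
    return False


def recolourings(x: str) -> list:
    """All words obtained from x by replacing any subset of its 0's by 2's."""
    choices = [("0", "2") if c == "0" else (c,) for c in x]
    return ["".join(p) for p in product(*choices)]


def binary_squares_from(x: str) -> set:
    """The binary squares h(yy), for y ranging over the recolourings of x."""
    return {apply(H, y + y) for y in recolourings(x)}


if __name__ == "__main__":
    x = "0110"
    for s in sorted(binary_squares_from(x)):
        print(s, len(s), has_cube(s))
```

With m = 4 this yields 2^2 = 4 distinct squares of length 12m = 48, matching the lower bound of 2^{m/2}.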

We now prove Theorem 2. 

Proof of Theorem 2. In the proof of Proposition 8 we showed that there are at least $2^{m/2}$ 
cubefree binary squares of length 12m for every positive integer m. Let S therefore be any 
set of cubefree squares over {0, 1} where S contains at least $2^{m/2}$ words of length 12m for 
every positive integer m. Let $x = x_1x_2\cdots$ be any infinite cubefree binary word over {2, 3}. 
Construct a word 

$$w = x_1 s_1 x_2 s_2 \cdots,$$

where the set of $s_i$'s is equal to the set S, so that w is cubefree and contains exponentially 
many distinct squares of length n. Let g be the morphism 

$$\begin{aligned}
0 &\to 001001101 \\
1 &\to 001010011 \\
2 &\to 001101011 \\
3 &\to 011001011.
\end{aligned}$$

Brandenburg [5, Theorem 6] showed that g maps cubefree words to cubefree words. Thus, 
g(w) is cubefree and, by the uniformity and injectivity of g, contains exponentially many 
distinct squares of length n. □ 
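A finite prefix of this construction can be built explicitly. The sketch below makes several simplifying assumptions that are not in the paper: the separators come from a Thue-Morse prefix rewritten over {2, 3}, and the list `SQUARES` is a small hypothetical stand-in for the set S (its words do not have length 12m). It interleaves separators and squares, applies g, and checks that the image is cubefree and still contains the image of each square.

```python
# The morphism g from the proof of Theorem 2.
G = {"0": "001001101", "1": "001010011", "2": "001101011", "3": "011001011"}

# Hypothetical stand-in for S: a few cubefree binary squares.
SQUARES = ["00", "11", "0101", "1010", "010010", "101101",
           "01100110", "10011001"]


def apply(morphism: dict, w: str) -> str:
    """Image of w under a morphism given letter by letter."""
    return "".join(morphism[c] for c in w)


def thue_morse_prefix(n: int) -> str:
    """Prefix of length n of the Thue-Morse word (which is cubefree)."""
    t = "0"
    while len(t) < n:
        t += "".join("1" if c == "0" else "0" for c in t)
    return t[:n]


def has_cube(w: str) -> bool:
    """True if w contains a factor yyy with y non-empty."""
    for p in range(1, len(w) // 3 + 1):
        run = 0
        for i in range(len(w) - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run >= 2 * p:
                return True
    return False


def build_w(squares: list) -> str:
    """Finite prefix x_1 s_1 x_2 s_2 ... with separators x_i over {2, 3}."""
    seps = thue_morse_prefix(len(squares)).translate(str.maketrans("01", "23"))
    return "".join(x + s for x, s in zip(seps, squares))


if __name__ == "__main__":
    w = build_w(SQUARES)
    gw = apply(G, w)
    print(w)
    print(len(gw), has_cube(gw))
```

Because g is uniform of length 9, each square s of length 2k in w gives a square g(s) of length 18k in g(w), and injectivity keeps distinct squares distinct.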

Note that Theorem 2 implies the existence of an infinite cubefree binary word with 
exponential factor complexity, i.e., with exponentially many factors of length n. Similarly, 
one can easily construct an infinite squarefree word over {0, 1, 2} with exponential factor 
complexity. 

Proposition 9. There exists an infinite squarefree word over {0, 1, 2} with exponential factor 
complexity. 

Proof. Let w be any infinite squarefree word over {0, 1, 2} and let x be any infinite word 
over {3, 4} with $2^n$ factors of length n for every positive n. Let y be the word obtained 
by forming the perfect shuffle of w and x: that is, if $w = w_0w_1w_2\cdots$ and $x = x_0x_1x_2\cdots$, 
then define $y = w_0x_0w_1x_1w_2x_2\cdots$. Clearly, y is a squarefree word with exponential factor 
complexity. Let f be the morphism 

$$\begin{aligned}
0 &\to 010201202101210212 \\
1 &\to 010201202102010212 \\
2 &\to 010201202120121012 \\
3 &\to 010201210201021012 \\
4 &\to 010201210212021012.
\end{aligned}$$

Brandenburg [5, Theorem 4] showed that f maps squarefree words to squarefree words. The 
uniformity and injectivity of f imply that f(y) is a squarefree word with exponential factor 
complexity, as required. □ 
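The perfect shuffle and the morphism f can be tried on finite prefixes. The sketch below rests on two choices that are assumptions of this example rather than part of the paper: the squarefree ternary word w is taken to be the fixed point of Thue's morphism 0 -> 012, 1 -> 02, 2 -> 1, and the word x over {3, 4} is obtained from the concatenation of the binary expansions of 1, 2, 3, ... (which contains every binary word as a factor). All function names are hypothetical.

```python
# The morphism f from the proof of Proposition 9.
F = {
    "0": "010201202101210212",
    "1": "010201202102010212",
    "2": "010201202120121012",
    "3": "010201210201021012",
    "4": "010201210212021012",
}
# Assumed source of a squarefree ternary word: Thue's morphism.
THUE = {"0": "012", "1": "02", "2": "1"}


def apply(morphism: dict, w: str) -> str:
    """Image of w under a morphism given letter by letter."""
    return "".join(morphism[c] for c in w)


def has_square(w: str) -> bool:
    """True if w contains a factor zz with z non-empty."""
    for p in range(1, len(w) // 2 + 1):
        run = 0
        for i in range(len(w) - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run >= p:
                return True
    return False


def squarefree_ternary_prefix(n: int) -> str:
    """Prefix of length n of the fixed point of THUE starting with 0."""
    w = "0"
    while len(w) < n:
        w = apply(THUE, w)
    return w[:n]


def rich_word_prefix(n: int) -> str:
    """Prefix of length n of the binary expansions of 1, 2, 3, ... over {3, 4}."""
    bits = "".join(format(i, "b") for i in range(1, n + 1))
    return bits[:n].translate(str.maketrans("01", "34"))


def perfect_shuffle(w: str, x: str) -> str:
    """w_0 x_0 w_1 x_1 ... for two words of equal length."""
    return "".join(a + b for a, b in zip(w, x))


if __name__ == "__main__":
    y = perfect_shuffle(squarefree_ternary_prefix(30), rich_word_prefix(30))
    fy = apply(F, y)
    print(y)
    print(len(fy), has_square(fy))
```

The shuffle is squarefree because a square of even half-length would project to a square in w, while a square of odd half-length would force a letter of {0, 1, 2} to equal a letter of {3, 4}.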